Hybrid syllable/triphone speech synthesis

Authors

Jindrich Matousek

Zdenek Hanzlícek

Daniel Tihelka

Abstract

In this paper, the syllable, an alternative phonetic unit to the phone, is investigated in the context of speech synthesis. Several approaches to syllable modelling within the statistical (hidden Markov model based) approach to acoustic unit inventory creation are proposed and evaluated. To make it possible to synthesize arbitrary text, the syllable inventories were supplemented with triphones, resulting in hybrid syllable/triphone inventories. Listening tests were carried out both to assess the quality of the synthetic speech produced using the hybrid syllable/triphone inventories and to choose the best approach to syllable modelling. The resulting synthetic speech is highly intelligible and fluent. Although the synthetic speech generated using the baseline triphone inventory was rated slightly better, the results of these very first experiments with syllable modelling are very promising.
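The abstract does not say how units are chosen at synthesis time. One plausible reading of "supplemented with triphones" is a back-off scheme: use a syllable unit when the inventory contains it, otherwise cover that syllable with triphones. The sketch below illustrates only that idea. The function names, the `sil` boundary symbol, the underscore-joined syllable labels, and the HTK-style `left-phone+right` triphone notation are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Set, Tuple

Unit = Tuple[str, str]  # (unit kind, unit label); a hypothetical representation


def triphones(phones: List[str], left: str, right: str) -> List[str]:
    """Label each phone with its left and right neighbour (left-phone+right)."""
    padded = [left] + phones + [right]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def select_units(syllables: List[List[str]], syllable_inventory: Set[str]) -> List[Unit]:
    """Prefer whole-syllable units; back off to triphones for unseen syllables."""
    units: List[Unit] = []
    for idx, syl in enumerate(syllables):
        label = "_".join(syl)  # assumed labelling scheme for syllable units
        if label in syllable_inventory:
            units.append(("syllable", label))
            continue
        # Context comes from the neighbouring syllables, or silence at the edges.
        left = syllables[idx - 1][-1] if idx > 0 else "sil"
        right = syllables[idx + 1][0] if idx + 1 < len(syllables) else "sil"
        units.extend(("triphone", t) for t in triphones(syl, left, right))
    return units


if __name__ == "__main__":
    # Toy input: two syllables, only the second one is in the syllable inventory.
    print(select_units([["p", "r", "a"], ["h", "a"]], {"h_a"}))
```

Because the triphone set can in principle cover any phone sequence, a scheme of this kind always produces some unit sequence for any input, which is the property the abstract attributes to the hybrid inventories.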

Similar resources

Syllable-based and hybrid acoustic models for Amharic speech recognition

This paper presents the results of our experiments on the use of hybrid acoustic units in speech recognition and the use of syllable and hybrid acoustic models (AM) in morpheme-based speech recognition. Although hybrid AMs did not bring improvement in speech recognition performance when words are used as dictionary entries and units in a language model (LM), we observed a significant word error ...

Text-to-audio-visual speech synthesis based on parameter generation from HMM

This paper describes a technique for synthesizing auditory speech and lip motion from an arbitrary given text. The technique is an extension of the visual speech synthesis technique based on an algorithm for parameter generation from HMM with dynamic features. Audio and visual features of each speech unit are modeled by a single HMM. Since both audio and visual parameters are generated simultan...

Syllable-based acoustic modeling for Japanese spontaneous speech recognition

We study a syllable-based acoustic modeling method for Japanese spontaneous speech recognition. Traditionally, mora-based acoustic models have been adopted for Japanese read speech recognition systems. In this paper, syllable-based and mora-based units are clearly distinguished in their definitions, and syllables are shown to be more suitable as an acoustic model for Japanese spontaneous ...

Effect of Prosodic Structure on Segmental Variants

There are a large number of segmental variants in a natural speech corpus. It is very important to label those variants correctly for a corpus-based TTS system. We successfully applied automatic triphone segmentation to a large speech corpus with syllable segmentation and prosodic annotation. In this paper, we also report (1) recognition error analysis based on prosodic structure, and (2) the re...

A nonlinear unit selection strategy for concatenative speech synthesis based on syllable level features

This paper describes an improved algorithm, motivated by fuzzy logic theory, for the selection of speech segments for concatenative synthesis from a huge database. Triphone HMM clustering is employed as an adaptive measure for articulatory similarity within a given database. Stress level contours are evaluated in the context of their surrounding vocalic peaks. The algorithm uses a beam search t...

Publication date: 2005
